Range-Clustering Queries
Authors
Abstract
In a geometric k-clustering problem the goal is to partition a set of points in R^d into k subsets such that a certain cost function of the clustering is minimized. We present data structures for orthogonal range-clustering queries on a point set S: given a query box Q and an integer k ≥ 2, compute an optimal k-clustering for S ∩ Q. We obtain the following results.
- We present a general method to compute a (1 + ε)-approximation to a range-clustering query, where ε > 0 is a parameter that can be specified as part of the query. Our method applies to a large class of clustering problems, including k-center clustering in any L_p-metric and a variant of k-center clustering where the goal is to minimize the sum (instead of the maximum) of the cluster sizes.
- We extend our method to deal with capacitated k-clustering problems, where each cluster should contain no more than a given number of points.
- For the special cases of rectilinear k-center clustering in R^1, and in R^2 for k = 2 or 3, we present data structures that answer range-clustering queries exactly.
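To make the query in the abstract concrete, here is a minimal baseline sketch, not the data structures the paper presents. It answers a range-clustering query naively: it scans all of S to find S ∩ Q, then runs Gonzalez's farthest-point heuristic, a standard 2-approximation for Euclidean k-center, rather than a (1 + ε)-approximation. The function names `points_in_box`, `greedy_k_center`, and `range_k_center` are hypothetical and chosen only for illustration.

```python
from math import dist


def points_in_box(points, lo, hi):
    """Return the points p with lo[i] <= p[i] <= hi[i] in every coordinate."""
    return [p for p in points
            if all(l <= x <= h for l, x, h in zip(lo, p, hi))]


def greedy_k_center(points, k):
    """Gonzalez's farthest-point heuristic (2-approximation, Euclidean metric).

    Returns (centers, radius), where radius is the largest distance from
    any point to its nearest chosen center.
    """
    if not points:
        return [], 0.0
    centers = [points[0]]
    # d[i] = distance from points[i] to its nearest center chosen so far
    d = [dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        i = max(range(len(points)), key=d.__getitem__)
        if d[i] == 0:
            break  # every point already coincides with some center
        centers.append(points[i])
        d = [min(old, dist(p, points[i])) for old, p in zip(d, points)]
    return centers, max(d)


def range_k_center(points, lo, hi, k):
    """Naive range-clustering query: filter by the box Q = [lo, hi], then cluster."""
    return greedy_k_center(points_in_box(points, lo, hi), k)


# Illustrative data (an assumption, not taken from the paper).
S = [(0, 0), (1, 0), (10, 0), (11, 0), (50, 50)]
print(range_k_center(S, lo=(-1, -1), hi=(12, 1), k=2))
# prints ([(0, 0), (11, 0)], 1.0)
```

This baseline spends time proportional to the size of S on every query. The point of a range-clustering data structure is to preprocess S so that such a full scan per query is avoided.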
Similar resources
Improving the View Selection Algorithm in Data Warehouses by Finding Frequent Queries
A data warehouse is a store of historical data used to support decision making. Analytic queries usually take a long time. To solve the response-time problem, some views should be materialized so that all queries can be answered in minimum response time. There are many solutions to the view selection problem. The most appropriate solution is to materialize frequent queries. Previously posed ...
A Clustered Dwarf Structure to Speed Up Queries on Data Cubes
Dwarf is a highly compressed structure that compresses the cube by eliminating semantic redundancies while computing a data cube. Although it has a high compression ratio, Dwarf is slower to query and more difficult to update because of its structural characteristics. Since the original intention of a data cube is to speed up query performance, we propose two novel clusterin...
Approximate Range Queries for Clustering
We study the approximate range searching for three variants of the clustering problem with a set P of n points in d-dimensional Euclidean space and axis-parallel rectangular range queries: the k-median, k-means, and k-center range-clustering query problems. We present data structures and query algorithms that compute (1 + ε)-approximations to the optimal clusterings of P ∩Q efficiently for a qu...
Cost-based query-adaptive clustering for multidimensional objects with spatial extents. (Groupement d'Objets Multidimensionnels Etendus avec un Modèle de Coût Adaptatif aux Requêtes)
We propose a cost-based query-adaptive clustering solution for multidimensional objects with spatial extents to speed up execution of spatial range queries (e.g., intersection, containment). Our work was motivated by the emergence of many SDI applications (Selective Dissemination of Information) bringing out new real challenges for the multidimensional data indexing. Our clusterin...
CCIndex: A Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries
Massive-scale distributed databases like Google’s BigTable and Yahoo!’s PNUTS can be modeled as Distributed Ordered Tables, or DOTs, which partition data regions and support range queries on keys. Multidimensional range queries on DOTs are a fundamental requirement; however, none of the existing schemes works well when considering three critical issues: high performance, low space overhead, and high r...
Publication date: 2017